{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ur8xi4C7S06n"
      },
      "outputs": [],
      "source": [
        "# Copyright 2024 Google LLC\n",
        "#\n",
        "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "#     https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JAPoU8Sm5E6e"
      },
      "source": [
        "# Getting Started with Translation\n",
        "\n",
        "<table align=\"left\">\n",
        "  <td style=\"text-align: center\">\n",
        "    <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/translation/intro_translation.ipynb\">\n",
        "      <img width=\"32px\" src=\"https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n",
        "    </a>\n",
        "  </td>\n",
        "  <td style=\"text-align: center\">\n",
        "    <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Ftranslation%2Fintro_translation.ipynb\">\n",
        "      <img width=\"32px\" src=\"https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n",
        "    </a>\n",
        "  </td>\n",
        "  <td style=\"text-align: center\">\n",
        "    <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/translation/intro_translation.ipynb\">\n",
        "      <img src=\"https://www.gstatic.com/images/branding/gcpiconscolors/vertexai/v1/32px.svg\" alt=\"Vertex AI logo\"><br> Open in Vertex AI Workbench\n",
        "    </a>\n",
        "  </td>\n",
        "  <td style=\"text-align: center\">\n",
        "    <a href=\"https://github.com/GoogleCloudPlatform/generative-ai/blob/main/translation/intro_translation.ipynb\">\n",
        "      <img width=\"32px\" src=\"https://raw.githubusercontent.com/primer/octicons/refs/heads/main/icons/mark-github-24.svg\" alt=\"GitHub logo\"><br> View on GitHub\n",
        "    </a>\n",
        "  </td>\n",
        "</table>\n",
        "\n",
        "<div style=\"clear: both;\"></div>\n",
        "\n",
        "<b>Share to:</b>\n",
        "\n",
        "<a href=\"https://www.linkedin.com/sharing/share-offsite/?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/translation/intro_translation.ipynb\" target=\"_blank\">\n",
        "  <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/8/81/LinkedIn_icon.svg\" alt=\"LinkedIn logo\">\n",
        "</a>\n",
        "\n",
        "<a href=\"https://bsky.app/intent/compose?text=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/translation/intro_translation.ipynb\" target=\"_blank\">\n",
        "  <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/7/7a/Bluesky_Logo.svg\" alt=\"Bluesky logo\">\n",
        "</a>\n",
        "\n",
        "<a href=\"https://twitter.com/intent/tweet?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/translation/intro_translation.ipynb\" target=\"_blank\">\n",
        "  <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/5a/X_icon_2.svg\" alt=\"X logo\">\n",
        "</a>\n",
        "\n",
        "<a href=\"https://reddit.com/submit?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/translation/intro_translation.ipynb\" target=\"_blank\">\n",
        "  <img width=\"20px\" src=\"https://redditinc.com/hubfs/Reddit%20Inc/Brand/Reddit_Logo.png\" alt=\"Reddit logo\">\n",
        "</a>\n",
        "\n",
        "<a href=\"https://www.facebook.com/sharer/sharer.php?u=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/translation/intro_translation.ipynb\" target=\"_blank\">\n",
        "  <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/51/Facebook_f_logo_%282019%29.svg\" alt=\"Facebook logo\">\n",
        "</a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "adb516335e46"
      },
      "source": [
        "| Author |\n",
        "| --- |\n",
        "| [Holt Skinner](https://github.com/holtskinner) |"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tvgnzT1CKxrO"
      },
      "source": [
        "## Overview\n",
        "\n",
        "This notebook demonstrates how to use the [Google Cloud Translation API](https://cloud.google.com/translate) to translate text in [130+ languages](https://cloud.google.com/translate/docs/languages)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "d975e698c9a4"
      },
      "source": [
        "### Objective\n",
        "\n",
        "This tutorial uses the following Google Cloud AI services and resources:\n",
        "\n",
        "- [Cloud Translation API](https://cloud.google.com/translate/docs/overview)\n",
        "- Cloud Storage\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "aed92deeb4a0"
      },
      "source": [
        "### Costs\n",
        "\n",
        "This tutorial uses billable components of Google Cloud:\n",
        "\n",
        "* Cloud Translation\n",
        "* Cloud Storage\n",
        "\n",
        "Learn about [Translate pricing](https://cloud.google.com/translate/pricing),\n",
        "and [Cloud Storage pricing](https://cloud.google.com/storage/pricing),\n",
        "and use the [Pricing Calculator](https://cloud.google.com/products/calculator/)\n",
        "to generate a cost estimate based on your projected usage."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "i7EUnXsZhAGF"
      },
      "source": [
        "## Getting Started\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NNSWiCNPjh_p"
      },
      "source": [
        "### Install Vertex AI SDK, other packages and their dependencies\n",
        "\n",
        "Install the following packages required to execute this notebook."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "2b4ef9b72d43"
      },
      "outputs": [],
      "source": [
        "# Install the packages\n",
        "%pip install --user --upgrade -q google-cloud-translate"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YbMFqPZ3tnwz"
      },
      "source": [
        "Set the project and region.\n",
        "\n",
        "* Please note the **available regions** for Translation, see [documentation](https://cloud.google.com/translate/docs/advanced/endpoints)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {
        "id": "GjSsu6cmUdEx"
      },
      "outputs": [],
      "source": [
        "# Use the environment variable if the user doesn't provide Project ID.\n",
        "import os\n",
        "\n",
        "PROJECT_ID = \"[your-project-id]\"  # @param {type: \"string\", placeholder: \"[your-project-id]\", isTemplate: true}\n",
        "if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n",
        "    PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n",
        "\n",
        "LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"us-central1\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "opUxT_k5TdgP"
      },
      "source": [
        "### Authenticating your notebook environment\n",
        "\n",
        "* If you are using **Colab** to run this notebook, run the cell below and continue.\n",
        "* If you are using **Vertex AI Workbench**, check out the setup instructions [here](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/setup-env)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "vbNgv4q1T2Mi"
      },
      "outputs": [],
      "source": [
        "import sys\n",
        "\n",
        "# Additional authentication is required for Google Colab\n",
        "if \"google.colab\" in sys.modules:\n",
        "    # Authenticate user to Google Cloud\n",
        "    from google.colab import auth\n",
        "\n",
        "    auth.authenticate_user()\n",
        "\n",
        "    ! gcloud config set project {PROJECT_ID}\n",
        "    ! gcloud auth application-default login -q"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "960505627ddf"
      },
      "source": [
        "### Import libraries"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "id": "PyQmSRbKA8r-"
      },
      "outputs": [],
      "source": [
        "from google.cloud import translate"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "v4gUI8WqciKS"
      },
      "source": [
        "### Create client"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {
        "id": "6Pl3un_YciKS"
      },
      "outputs": [],
      "source": [
        "client = translate.TranslationServiceClient(\n",
        "    # Optional: https://cloud.google.com/translate/docs/advanced/endpoints\n",
        "    # client_options=ClientOptions(\n",
        "    #     api_endpoint=f\"translate-{TRANSLATE_LOCATION}.googleapis.com\"\n",
        "    # )\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "v5CEc4-Wrjk2"
      },
      "source": [
        "### Create helper functions"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "metadata": {
        "id": "kYx2wwhjrmD6"
      },
      "outputs": [],
      "source": [
        "def translate_text(\n",
        "    text: str,\n",
        "    project_id: str = PROJECT_ID,\n",
        "    location: str = LOCATION,\n",
        "    glossary: str | None = None,\n",
        ") -> translate.TranslateTextResponse:\n",
        "    \"\"\"Translating Text.\"\"\"\n",
        "    # Translate text from English to Spanish\n",
        "    # Detail on supported types can be found here:\n",
        "    # https://cloud.google.com/translate/docs/supported-formats\n",
        "    response = client.translate_text(\n",
        "        request=translate.TranslateTextRequest(\n",
        "            parent=client.common_location_path(project_id, location),\n",
        "            contents=[text],\n",
        "            # Supported language codes: https://cloud.google.com/translate/docs/languages\n",
        "            source_language_code=\"en\",\n",
        "            target_language_code=\"es\",\n",
        "            glossary_config=(\n",
        "                translate.TranslateTextGlossaryConfig(glossary=glossary)\n",
        "                if glossary\n",
        "                else None\n",
        "            ),\n",
        "        )\n",
        "    )\n",
        "\n",
        "    return response\n",
        "\n",
        "\n",
        "def create_glossary(\n",
        "    input_uri: str,\n",
        "    glossary_id: str,\n",
        "    project_id: str = PROJECT_ID,\n",
        "    location: str = LOCATION,\n",
        "    timeout: int = 180,\n",
        ") -> translate.Glossary:\n",
        "    \"\"\"\n",
        "    Create a unidirectional glossary. Glossary can be words or\n",
        "    short phrases (usually fewer than five words).\n",
        "    https://cloud.google.com/translate/docs/advanced/glossary#format-glossary\n",
        "    \"\"\"\n",
        "    glossary = translate.Glossary(\n",
        "        name=client.glossary_path(project_id, location, glossary_id),\n",
        "        # Supported language codes: https://cloud.google.com/translate/docs/languages\n",
        "        language_pair=translate.Glossary.LanguageCodePair(\n",
        "            source_language_code=\"en\", target_language_code=\"es\"\n",
        "        ),\n",
        "        input_config=translate.GlossaryInputConfig(\n",
        "            gcs_source=translate.GcsSource(input_uri=input_uri)\n",
        "        ),\n",
        "    )\n",
        "\n",
        "    # glossary is a custom dictionary Translation API uses\n",
        "    # to translate the domain-specific terminology.\n",
        "    operation = client.create_glossary(\n",
        "        parent=client.common_location_path(project_id, location), glossary=glossary\n",
        "    )\n",
        "\n",
        "    result = operation.result(timeout)\n",
        "    return result"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "a93a71e75b78"
      },
      "source": [
        "Now let's try to translate a simple phrase from English to Spanish."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "metadata": {
        "id": "2e8e84e860db"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Translated text: ¡Hola!\n"
          ]
        }
      ],
      "source": [
        "response = translate_text(\"Hi there!\")\n",
        "\n",
        "# Display the translation for each input text provided\n",
        "for translation in response.translations:\n",
        "    print(f\"Translated text: {translation.translated_text}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6a84167808aa"
      },
      "source": [
        "## Glossaries"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bd76360f8068"
      },
      "source": [
        "That looks great! However, let's look at what happens if we try to translate a technical word, such as the Google Cloud product [Compute Engine](https://cloud.google.com/compute?hl=en)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 21,
      "metadata": {
        "id": "894401e53921"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Translated text: Motor de Computación\n"
          ]
        }
      ],
      "source": [
        "response = translate_text(\"Compute Engine\")\n",
        "\n",
        "# Display the translation for each input text provided\n",
        "for translation in response.translations:\n",
        "    print(f\"Translated text: {translation.translated_text}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6760e9c66d88"
      },
      "source": [
        "### Create a Glossary"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "120088b42a11"
      },
      "source": [
        "Notice that the Translation API translated the name literally.\n",
        "\n",
        "Suppose we want this name to be the same in all languages, we can create a [Glossary](https://cloud.google.com/translate/docs/advanced/glossary) to consistently translate domain-specific words and phrases.\n",
        "\n",
        "Next, we'll create a glossary for lots of Google Cloud product names to indicate how they should be translated into Spanish.\n",
        "\n",
        "We've already created an input TSV file and uploaded it to a publicly-accessible Cloud Storage bucket."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 32,
      "metadata": {
        "id": "25714948fd80"
      },
      "outputs": [],
      "source": [
        "glossary = create_glossary(\n",
        "    input_uri=\"gs://github-repo/translation/GoogleCloudGlossary.tsv\",\n",
        "    glossary_id=\"google_cloud_english_to_spanish\",\n",
        ")\n",
        "print(glossary)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "059abb41d254"
      },
      "source": [
        "Now, let's try translating the text again using the glossary."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "metadata": {
        "id": "0c9e51afa18f"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Default Translated text: Motor de Computación\n",
            "Glossary Translated text: Compute Engine\n"
          ]
        }
      ],
      "source": [
        "response = translate_text(\"Compute Engine\", glossary=glossary.name)\n",
        "\n",
        "# Display the translation for each input text provided\n",
        "for translation in response.translations:\n",
        "    print(f\"Default Translated text: {translation.translated_text}\")\n",
        "\n",
        "for translation in response.glossary_translations:\n",
        "    print(f\"Glossary Translated text: {translation.translated_text}\")"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "name": "intro_translation.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
